On the sampling distribution of resubstitution and leave-one-out error estimators for linear classifiers

نویسندگان

  • Amin Zollanvari
  • Ulisses Braga-Neto
  • Edward R. Dougherty
چکیده

Error estimation is a problem of high current interest in many areas of application. This paper concerns the classical problem of determining the performance of error estimators in small-sample settings under a Gaussianity parametric assumption. We provide here for the first time the exact sampling distribution of the resubstitution and leave-one-out error estimators for linear discriminant analysis (LDA) in the univariate case, which is valid for any sample size and combination of parameters (including unequal variances and sample sizes for each class). In the multivariate case case, we provide a quasi-binomial approximation to the distribution of both the resubstitution and leave-one-out error estimators for LDA, under a common but otherwise arbitrary class covariance matrix, which is assumed to be known in the design of the LDA discriminant. We provide numerical examples, using both synthetic and real data, that indicate that these approximations are accurate, provided that LDA classification error is not too large.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Exact performance of error estimators for discrete classifiers

Discrete Classification problems abound in pattern recognition and data mining applications. One of the most common discrete rules is the discrete histogram rule. This paper presents exact formulas for the computation of bias, variance, and RMS of the resubstitution and leave-one-out error estimators, for the discrete histogram rule. We also describe an algorithm to compute the exact probabilit...

متن کامل

Exact representation of the second-order moments for resubstitution and leave-one-out error estimation for linear discriminant analysis in the univariate heteroskedastic Gaussian model

This paper provides exact analytical expressions for the bias, variance, and RMS for the resubstitution and leave-one-out error estimators in the case of linear discriminant analysis (LDA) in the univariate heteroskedastic Gaussian model. Neither the variances nor the sample sizes for the two classes need be the same. The generality of heteroskedasticity (unequal variances) is a fundamental fea...

متن کامل

Exact Performance of CoD Estimators in Discrete Prediction

The coefficient of determination (CoD) has significant applications in genomics, for example, in the inference of gene regulatory networks. We study several CoD estimators, based upon the resubstitution, leave-one-out, cross-validation, and bootstrap error estimators. We present an exact formulation of performance metrics for the resubstitution and leave-one-out CoD estimators, assuming the dis...

متن کامل

Joint Sampling Distribution Between Actual and Estimated Classification Errors for Linear Discriminant Analysis1

Error estimation must be used to find the accuracy of a designed classifier, an issue that is critical in biomarker discovery for disease diagnosis and prognosis in genomics and proteomics. This paper presents, for what is believed to be the first time, the analytical formulation for the joint sampling distribution of the actual and estimated errors of a classification rule. The analysis presen...

متن کامل

Optimal convex error estimators for classification

A cross-validation error estimator is obtained by repeatedly leaving out some data points, deriving classifiers on the remaining points, computing errors for these classifiers on the left-out points, and then averaging these errors. The 0.632 bootstrap estimator is obtained by averaging the errors of classifiers designed from points drawn with replacement and then taking a convex combination of...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Pattern Recognition

دوره 42  شماره 

صفحات  -

تاریخ انتشار 2009